ABSTRACT

The study of coding for constant-data-rate systems, begun in Part I, is
extended by considering the use of multiple-error-correcting codes. The
principle of the Wagner code is used to construct two new multiple-error-correcting
codes, the Hamming-Wagner code and the syllabified Wagner
code. The performances of these codes, and of the Reed code (a multiple-error-correcting
code not of the Wagner type), are compared. Of the
three Wagner-type codes, the ordinary Wagner code is best for short
words, the Hamming-Wagner for medium length and long words, and
the syllabified Wagner for very long words. The Reed three-error-correcting
code (as yet only applicable to a few isolated values of word
length) outperforms both the Hamming-Wagner and syllabified Wagner
codes.
CODING FOR CONSTANT-DATA-RATE SYSTEMS
II. MULTIPLE-ERROR-CORRECTING CODES


A. INTRODUCTION

In Part I of this paper¹ we introduced the Wagner code, a new means of correcting
single errors in sequences of binary digits. (We call such sequences words.) It differs from
the Hamming code² by being likely rather than certain to correct single errors. However, it
uses only one check digit, whereas the Hamming code uses several (the number depending on the
number of message digits). In communication systems with a fixed word length (constant-data-rate)
the economy of the Wagner code in the use of check digits can offset the disadvantage of not
correcting all single errors. We found that for such systems short Wagner-coded words have
much smaller probabilities of error than the corresponding Hamming-coded words.

We now consider the performance of multiple-error-correcting codes in constant-data-rate
systems. Again we have codes like Hamming's, which correct errors by using algebraic
relations between received message and check digits, and codes like Wagner's, which
require stored a posteriori probabilities suitably computed by the receiver, as well as the
received word.¹ (Some may object to the word "code" as applied to the Wagner scheme, since
information other than the received word is required. However, we feel that the phrase "Wagner
code" is justified by its linguistic convenience.) There are also "mixed" codes, which correct
some errors by using only algebraic relations between the received digits, and other errors by
using the Wagner scheme. The Hamming-Wagner code described in Sec. B is such a "mixed"
code. We also examine a syllabified Wagner code, in which each word is split up into separately
Wagner-coded subwords, and the class of multiple-error-correcting codes recently developed
by I. S. Reed and others.³ Our conclusions are summarized in Sec. E.


B. THE HAMMING-WAGNER CODE


We consider systems such that each digit (one of two electrical signals, x_1(t) and
x_2(t), of duration T and bandwidth W, TW ≫ 1) is corrupted in the channel by the addition of
white Gaussian noise. If y(t) is the received signal, x_1(t) or x_2(t) is chosen as the transmitted
signal according as

$$z_1 = \int x_1(t)\,y(t)\,dt \tag{1}$$

or

$$z_2 = \int x_2(t)\,y(t)\,dt \tag{2}$$

is the larger. It was shown in Part I, with which familiarity is assumed, that the correlations
z_1 and z_2 are monotone functions of the a posteriori probabilities p(x_1/y) and p(x_2/y).** Moreover,
they are more convenient quantities for calculation, since for suitable x_1 and x_2 they are
independent Gaussian random variables, with means c_1 and c_2 (c_1 > c_2), and standard deviations
σ_1 and σ_2.

*The reader is reminded that the shorter a digit, the greater its probability of error.¹
**The a posteriori probability that if y is received, x was sent, is denoted by p(x/y).

The Wagner code (analyzed in Part I) operates as follows: the values of z_1 and z_2
corresponding to each digit of a received word are stored in a memory for the duration of the
word. The last digit of each transmitted word was chosen to make the sum of all the digits even
(parity check). The sum of all the tentatively identified digits will be odd if the received word
differs from the corresponding transmitted word in an odd number of digits. Whenever
the sum is odd, the receiver assumes that the error is in only one digit, and alters the digit for
which the stored correlations differ by the smallest amount. We found that the probability of
error per Wagner-coded word of m message digits is


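$$P_W = 1 - q^{m+1}(a) - \Pi_{m+1}(a), \tag{I-29}$$

where*

$$a = \frac{c_1 - c_2}{\sqrt{2\,(\sigma_1^2 + \sigma_2^2)}} \tag{3}$$

and

$$p(a) = \tfrac{1}{2}\,(1 - \operatorname{erf} a), \tag{I-7}$$

$$q(a) = \tfrac{1}{2}\,(1 + \operatorname{erf} a). \tag{I-14}$$

The quantity Π_n(a) is the multiple integral

$$\Pi_n(a) = \frac{n!}{(\sqrt{\pi})^{n}} \int_0^{\infty} e^{-(x_n - a)^2}\,dx_n \int_0^{x_n} e^{-(x_{n-1} - a)^2}\,dx_{n-1} \cdots \int_0^{x_3} e^{-(x_2 - a)^2}\,dx_2 \int_0^{x_2} e^{-(x_1 + a)^2}\,dx_1, \tag{I-12}$$

which can be reduced by repeated integration by parts to the recurrence relation

$$\Pi_n(a) = \frac{1}{2}\binom{n}{1}\Pi_{n-1}(a) - \frac{1}{2^2}\binom{n}{2}\Pi_{n-2}(a) + \cdots + \frac{(-1)^{n-1}}{2^{n-2}}\binom{n}{2}\Pi_2(a) + \frac{(-1)^{n}\,n}{2^{n-1}}\left[I_1(a) - I_n(a)\right], \tag{I-22}$$

where

$$I_n(a) = \frac{1}{\sqrt{\pi}} \int_0^{\infty} \left[\operatorname{erf}(x - a)\right]^{n-1} \exp\left[-(x + a)^2\right] dx. \tag{I-23}$$

*Equation numbers preceded by the Roman numeral I refer to correspondingly numbered equations in Part I.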




UNCLASSIFIED 




It is shown in Appendix B that Eq. (I-22) can be reduced to the sum

$$\Pi_n(a) = \frac{n}{2^{n-1}} \sum_{i=1}^{n} \binom{n-1}{i-1} (-1)^{i+1}\, I_i(a). \tag{4}$$

All the terms of Eq. (4) are positive for values of a in the range of interest (a = 1.0 to 3.0, say),
so that Eq. (4) is much more suited for numerical work than Eq. (I-22) when n is large.
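
As a numerical cross-check, the following sketch evaluates Π_n(a) in two ways: from the finite sum of Eq. (4), and from a single integral over the magnitude of the one erroneous digit. The single-integral form is an assumption of this sketch (it follows from the symmetry of the ordered multiple integral and is not quoted from Part I). The function names, the Simpson's-rule integrator, and the upper integration limit are illustrative choices.

```python
import math

def simpson(f, lo, hi, steps=2000):
    """Composite Simpson's rule on [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0

UPPER = 10.0  # exp(-(x + a)^2) is negligible beyond this point for a >= 0

def I(i, a):
    """I_i(a): (1/sqrt(pi)) * integral of erf(x-a)^(i-1) * exp(-(x+a)^2)."""
    g = lambda x: math.erf(x - a) ** (i - 1) * math.exp(-(x + a) ** 2)
    return simpson(g, 0.0, UPPER) / math.sqrt(math.pi)

def pi_sum(n, a):
    """Pi_n(a) from the finite sum over I_1 ... I_n."""
    s = sum(math.comb(n - 1, i - 1) * (-1) ** (i + 1) * I(i, a)
            for i in range(1, n + 1))
    return n * s / 2 ** (n - 1)

def pi_direct(n, a):
    """Pi_n(a) as one integral (assumed equivalent form, see the lead-in)."""
    g = lambda x: (math.exp(-(x + a) ** 2)
                   * (0.5 * (1.0 - math.erf(x - a))) ** (n - 1))
    return n * simpson(g, 0.0, UPPER) / math.sqrt(math.pi)
```

For n = 1 both forms reduce to p(a), the probability that a single digit is in error, which gives a quick sanity check on the normalization.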

We now extend the principle of the Wagner code to a double-error-correcting code.
The following procedure appears best as a first attempt. Further check digits are added to the
Wagner-coded word; these reveal double as well as single errors. If a double error is detected,
we change the two digits of the stored word with the smallest correlator differences. If a single
error is detected, we change only the digit with the smallest correlator difference.
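
The correction step just described can be sketched as follows. This is a minimal illustration, not the receiver design of the report: the function names are hypothetical, the correlator difference of each digit is assumed to be available as a signed number, and the count of detected errors is assumed to come from the check digits (for the simple Wagner code it is just the parity failure).

```python
def parity_errors(digits):
    """Simple Wagner code: 1 if the overall parity check fails, else 0."""
    return sum(digits) % 2

def wagner_correct(digits, differences, errors_detected):
    """Flip the `errors_detected` digits with the smallest |correlator difference|."""
    order = sorted(range(len(digits)), key=lambda i: abs(differences[i]))
    corrected = list(digits)
    for i in order[:errors_detected]:
        corrected[i] ^= 1  # change the least reliable digit(s)
    return corrected

# Transmitted word 1 0 1 1 0 1 (even sum); the third digit is received wrongly
# and, as assumed here, also has the smallest correlator difference.
received = [1, 0, 0, 1, 0, 1]
differences = [2.1, -1.9, -0.2, 1.7, -2.3, 1.8]
decoded = wagner_correct(received, differences, parity_errors(received))
```

Sorting by magnitude keeps the same routine usable for the double-error case, where two digits are changed instead of one.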

The success of this scheme requires a system of check digits which indicates both
single and double errors, and further allows them to be distinguished. The geometrical model
of message space (see Appendix A) is well suited for examining the possibility of setting up such
check digits. Referring to Fig. 1, we see that if both single and double errors in possible
transmitted points (such as P_1 and P_2) are to be detectable, and if single errors are to be
distinguishable from double errors, every such pair of points must be separated by a distance of 4
or more. For then, a single error in P_1 sends it to a neighboring point like S_1, where it can be
stated with certainty to have come either from P_1 by a change in one digit, or from some other
possible transmitted message by a change in three or more digits. Similarly, a single error in P_2
sends it to a neighbor like S_2. On the other hand, a double error in either P_1 or P_2 may
correspond to a received point like D, at a distance of 2 from both. Unless there are at least three
points between all pairs of possible transmitted points, a double error in P_1 (say) is
indistinguishable from a single error in P_2 (or some other transmitted point), so that we do not
know whether to correct one or two digits in the received word.

Fig. 1. Configuration of points in message space between two possible transmitted messages
P_1 and P_2 (points in order: P_1, S_1, D, S_2, P_2).
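
The separation requirement can be checked by brute force for a small code. The sketch below builds the 16 words of an 8-digit single-error-correcting, double-error-detecting Hamming code and confirms that every pair of words is at distance 4 or more. The generator rows are one common systematic form of that code and are an assumption of this example; they are not taken from the report.

```python
from itertools import product

# Assumed generator rows: 4 message digits followed by 4 check digits.
GENERATOR = [
    (1, 0, 0, 0, 0, 1, 1, 1),
    (0, 1, 0, 0, 1, 0, 1, 1),
    (0, 0, 1, 0, 1, 1, 0, 1),
    (0, 0, 0, 1, 1, 1, 1, 0),
]

def encode(message):
    """Add (modulo 2) the generator rows selected by the message digits."""
    word = [0] * len(GENERATOR[0])
    for bit, row in zip(message, GENERATOR):
        if bit:
            word = [w ^ r for w, r in zip(word, row)]
    return tuple(word)

def distance(u, v):
    """Number of digit positions in which two words differ."""
    return sum(x != y for x, y in zip(u, v))

codewords = [encode(m) for m in product((0, 1), repeat=4)]
min_distance = min(distance(u, v)
                   for i, u in enumerate(codewords)
                   for v in codewords[i + 1:])
```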

Now in a Hamming single-error-correcting, double-error-detecting code, all transmitted
messages are separated by at least a distance of 4 (see Appendix A). This is just the
separation required for successful operation of a Wagner code that corrects both single and
double errors. Thus the number of check digits needed to correct all single errors before applying
the Wagner procedure to double errors is the same as the number required to apply the
Wagner procedure to both single and double errors. This suggests a "Hamming-Wagner" code
of the "mixed" type mentioned in Sec. A, which is obviously better than the corresponding
"Wagner-Wagner" code.

We thus arrive at a code that is like the Hamming single-error-correcting, double-error-detecting
code, except that if the extra check digit indicates a double error, we change
the two digits with the smallest correlator differences.* The analysis of this Hamming-Wagner
code is completely analogous to that of the simple Wagner code.

*It is inevitable that some higher-order errors will be mistaken for double errors and will remain uncorrected
after applying the Wagner procedure. However, this is no worse than leaving them uncorrected in the first
place. (See Appendix A.)

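The probability of error per Hamming-Wagner-coded word is

$$P_{HW} = 1 - q^{m+k+1}(a) - (m + k + 1)\,q^{m+k}(a)\,p(a) - {}^{2}\Pi_{m+k+1}(a), \tag{5}$$

where a, p(a), and q(a) have already been defined. The quantity k is the number of check digits
required by the Hamming single-error-correcting code.* The quantity ²Π_n(a) [in analogy to
Eq. (I-12)] is the multiple integral

$${}^{2}\Pi_n(a) = \frac{n!}{(\sqrt{\pi})^{n}} \int_0^{\infty} e^{-(x_n - a)^2}\,dx_n \int_0^{x_n} e^{-(x_{n-1} - a)^2}\,dx_{n-1} \cdots \int_0^{x_4} e^{-(x_3 - a)^2}\,dx_3 \int_0^{x_3} e^{-(x_2 + a)^2}\,dx_2 \int_0^{x_2} e^{-(x_1 + a)^2}\,dx_1. \tag{6}$$

Repeated integration by parts reduces Eq. (6) to the recurrence relation

$${}^{2}\Pi_n(a) = \frac{1}{2}\binom{n}{1}\,{}^{2}\Pi_{n-1}(a) - \frac{1}{2^2}\binom{n}{2}\,{}^{2}\Pi_{n-2}(a) + \cdots + \frac{(-1)^{n-1}\,n(n-1)}{2^{n}}\left[{}^{2}I_2(a) - {}^{2}I_n(a)\right], \tag{7}$$

where

$${}^{2}I_n(a) = \frac{2}{\sqrt{\pi}} \int_0^{\infty} \left[\operatorname{erf}(x - a)\right]^{n-2} \exp\left[-(x + a)^2\right] \left[\operatorname{erf}(x + a) - \operatorname{erf} a\right] dx. \tag{8}$$

Equation (7) can be further reduced to the sum

$${}^{2}\Pi_n(a) = \frac{n(n-1)}{2^{n}} \sum_{i=2}^{n} \binom{n-2}{i-2} (-1)^{i}\; {}^{2}I_i(a), \tag{9}$$

in complete analogy to Eq. (4). (See Appendix B.)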

P_HW(1.35) and P_HW(1.80) are tabulated in Table I for various values of m, together
with the corresponding probabilities of error for uncoded, Hamming-coded, and Wagner-coded
words. The values of a used in computing P_U, P_H, and P_W are chosen so that all words (message
digits plus check digits) have the same duration, as required in a constant-data-rate system.

Thus

$$P_U(a_U) = 1 - q^{m}(a_U), \qquad a_U = \sqrt{\frac{m + k + 1}{m}}\;a, \tag{10}$$

$$P_H(a_H) = 1 - q^{m+k}(a_H) - (m + k)\,q^{m+k-1}(a_H)\,p(a_H), \qquad a_H = \sqrt{\frac{m + k + 1}{m + k}}\;a, \tag{11}$$

$$P_W(a_W) = 1 - q^{m+1}(a_W) - \Pi_{m+1}(a_W), \qquad a_W = \sqrt{\frac{m + k + 1}{m + 1}}\;a. \tag{12}$$

*For m = 5 to 11, k = 4; for m = 12 to 26, k = 5; etc.²
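
The constant-data-rate scaling of a, together with Eqs. (10) and (11), is simple enough to evaluate directly. The sketch below is illustrative only: the function names are hypothetical, and the rule used for k (the smallest k with 2^k ≥ m + k + 1) is the standard Hamming bound, which agrees with the footnote values but is stated here as an assumption.

```python
import math

def p(a):
    """Probability that a single digit is in error."""
    return 0.5 * (1.0 - math.erf(a))

def q(a):
    """Probability that a single digit is correct."""
    return 0.5 * (1.0 + math.erf(a))

def hamming_check_digits(m):
    """Smallest k with 2**k >= m + k + 1 (single-error-correcting code)."""
    k = 1
    while 2 ** k < m + k + 1:
        k += 1
    return k

def uncoded_and_hamming(m, a):
    """Return (P_U, P_H); `a` is the value for the Hamming-Wagner word."""
    k = hamming_check_digits(m)
    n_hw = m + k + 1                      # digits in the Hamming-Wagner word
    a_u = math.sqrt(n_hw / m) * a         # longer digits when no checks are sent
    a_h = math.sqrt(n_hw / (m + k)) * a
    p_u = 1.0 - q(a_u) ** m
    n = m + k
    p_h = 1.0 - q(a_h) ** n - n * q(a_h) ** (n - 1) * p(a_h)
    return p_u, p_h

p_u, p_h = uncoded_and_hamming(10, 1.35)
```

With m = 10 and a = 1.35 this gives values close to the first row of Table I(a).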
TABLE I

COMPARISON OF HAMMING, WAGNER, AND HAMMING-WAGNER CODES

(a) a = 1.35

 m     P_U      P_H      P_W      P_HW
10    0.093    0.044    0.030    0.037
11    0.111    0.050    0.038    0.043
12    0.110    0.065    0.038    0.057
13    0.128    0.073    0.047    0.064
14    0.146    0.081    0.056    0.071
15    0.165    0.090    0.067    0.079
16    0.183    0.098    0.079    0.087
17    0.202    0.107    0.092    0.096
18    0.220    0.116    0.105    0.104
19    0.239    0.126    0.119    0.113
20    0.257    0.135    0.134    0.122
21    0.275    0.145    0.149    0.132

(b) a = 1.80

 m     P_U       P_H       P_W        P_HW
10    0.0091    0.0016    0.00067    0.00063
11    0.0117    0.0018    0.00095    0.00076
12    0.0109    0.0025    0.00081    0.00107
13    0.0135    0.0029    0.00111    0.00124
14    0.0163    0.0033    0.0015     0.0014
15    0.0193    0.0037    0.0019     0.0016
16    0.0224    0.0042    0.0024     0.0019
17    0.0258    0.0046    0.0030     0.0021
18    0.0292    0.0051    0.0037     0.0024
19    0.0328    0.0057    0.0044     0.0026
20    0.0364    0.0062    0.0052     0.0029
21    0.0401    0.0068    0.0061     0.0032
22    0.0439    0.0074    0.0070     0.0036
23    0.0478    0.0080    0.0080     0.0039
24    0.0518    0.0086    0.0091     0.0043

Values of a are for the Hamming-Wagner code.
m = number of message digits.
Table I shows that for a = 1.35, a very noisy case, the Hamming code becomes better
than the Wagner code at m = 21. For a = 1.80, which corresponds to much less noise, the
Hamming code surpasses the Wagner code at m = 24. Thus it appears that, starting with some
value of m between 25 and 30, the Hamming code is better than the Wagner code anywhere in the
significant range of a (neither too little nor too much noise). This happens for the reasons given
in Part I: (1) the ratio k/m decreases with increasing m, so that corresponding values of a for
the two codes become more nearly alike, dissipating the advantage of the Wagner code's economy
in the use of check digits, and (2) the conditional probability that the Wagner code corrects single
errors decreases as m increases.

We see from the table that the Hamming-Wagner code is consistently better than the
Hamming code; however, the percentage improvement is greater for a = 1.80 than for the noisier
case a = 1.35. For a = 1.35, the Hamming-Wagner code is better than the Wagner code for all
m > 17; for a = 1.80, the Hamming-Wagner code is better than the Wagner code for all m > 13.*
Thus, while the Wagner code is superior to the Hamming code for words of length less than about
20,** the Hamming-Wagner code is superior to either of these codes for words of length greater
than about 15.† The Hamming-Wagner code works better in low noise than in high noise, because
(1) proportionately fewer multiple errors are of order higher than two, and (2) the conditional
probability of correcting double errors is higher. Since this conditional probability decreases
as m increases, the Hamming-Wagner code gradually becomes less effective, as shown in the
next section.


C. THE SYLLABIFIED WAGNER CODE

Another multiple-error-correcting code based on the principle of the Wagner code is
the syllabified Wagner code, constructed by dividing each word into separately Wagner-coded
subwords or syllables. Suppose a word with m message digits is divided into j syllables, each
containing n_i = m_i + 1 digits, where

$$\sum_{i=1}^{j} m_i = m.$$

Since the probability that a syllable (regarded as a Wagner-coded word) is correct is

$$q^{n_i}(a) + \Pi_{n_i}(a),$$

the probability of error for a syllabified-Wagner-coded word is

$$P_{SW}(m_1, m_2, \ldots, m_j) = 1 - \prod_{i=1}^{j} \left[q^{n_i}(a) + \Pi_{n_i}(a)\right], \qquad \sum_{i=1}^{j} m_i = m. \tag{13}$$

*The Hamming-Wagner code is also better for m = 10 and 11. This anomaly is due to the change in k from 4 to
5 at m = 12.

**A comparison of the Hamming and Wagner codes in the range m = 4 to 8 is given in Part I.

†The value of m for which one code becomes better than another is somewhat dependent on a. (See Table I.)
It follows at once that for a given number of syllables P_SW(m_1, m_2, ..., m_j) is smallest when
the syllables have equal length (or as nearly equal as possible). For if we write Eq. (13) as

$$P_{SW}(m_1, m_2, \ldots, m_j) = 1 - f\Big(m - \sum_{i=1}^{j-1} m_i\Big) \prod_{i=1}^{j-1} f(m_i), \tag{14}$$

where

$$f(m_i) = q^{n_i}(a) + \Pi_{n_i}(a), \tag{15}$$

we obtain, after differentiating Eq. (14) with respect to m_k and equating the result to zero,

$$\frac{f'(m_k)}{f(m_k)} = \frac{f'\big(m - \sum_{i=1}^{j-1} m_i\big)}{f\big(m - \sum_{i=1}^{j-1} m_i\big)}, \qquad k = 1, 2, \ldots, j-1. \tag{16}$$

Consequently,

$$\frac{f'(m_1)}{f(m_1)} = \frac{f'(m_2)}{f(m_2)} = \cdots = \frac{f'(m_j)}{f(m_j)}, \tag{17}$$

so that P_SW(m_1, m_2, ..., m_j) is smallest when all the f(m_i) are equal, i.e., when all the
syllables are of equal length (or as nearly equal as possible).
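
The equal-length result can be illustrated numerically. The sketch below evaluates Eq. (13) for several ways of dividing 14 message digits into two syllables at a fixed a. It assumes the single-integral form of Π_n(a) (one erroneous digit whose correlator difference is smaller than that of every correct digit); that form, the function names, and the numerical integrator are assumptions of this example rather than the report's own procedure.

```python
import math

def simpson(f, lo, hi, steps=2000):
    """Composite Simpson's rule on [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0

def q(a):
    return 0.5 * (1.0 + math.erf(a))

def pi_direct(n, a):
    """Probability of exactly one error that is also the least reliable digit."""
    g = lambda x: (math.exp(-(x + a) ** 2)
                   * (0.5 * (1.0 - math.erf(x - a))) ** (n - 1))
    return n * simpson(g, 0.0, 10.0) / math.sqrt(math.pi)

def syllable_ok(m_i, a):
    """f(m_i): probability that a syllable of m_i message digits is correct."""
    n_i = m_i + 1  # one parity digit per syllable
    return q(a) ** n_i + pi_direct(n_i, a)

def p_sw(message_lengths, a):
    """Word error probability for the given division into syllables."""
    prob_ok = 1.0
    for m_i in message_lengths:
        prob_ok *= syllable_ok(m_i, a)
    return 1.0 - prob_ok

a = 1.6
results = {split: p_sw(split, a) for split in [(7, 7), (6, 8), (4, 10)]}
```

Because the number of syllables is held fixed, a is the same for every division, so the comparison isolates the effect of syllable length alone.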

If too few syllables are used, the conditional probability of correction of single errors
per syllable is small because the syllables are too long. If too many syllables are used, this
conditional probability is small because the large number of check digits leads to a small value
of a. (This second effect is partially compensated by increased multiple-error-correction
possibilities.) The optimum number of syllables is a compromise between these two effects. This
optimum number is not necessarily critical, or for that matter the same for all a. The simple
Wagner code (which may be considered a syllabified Wagner code of one syllable) is clearly best
for short words. At about m = 14, division into two syllables is better than the simple Wagner
code. At m = 30, divisions into three and four syllables are about equally effective, and better
than divisions into more or fewer syllables. A syllable length of seven to ten digits seems to be
best.

All these points are illustrated in Table II, which compares P_HW and P_SW for several
values of m and a_HW = 1.80. The table also shows how the syllabified Wagner code finally
surpasses the Hamming-Wagner code at about m = 80. As previously mentioned, this is due to the
decrease in the conditional probability C_HW that the Hamming-Wagner code corrects double
errors, as m increases. This decrease in C_HW is also shown in Table II. The formulas used
for calculating the P's are the same as those in Eqs. (10), (12), and (13) with the a's related by

$$a_U = \sqrt{\frac{m + k + 1}{m}}\;a_{HW}, \qquad a_W = \sqrt{\frac{m + k + 1}{m + 1}}\;a_{HW}, \qquad a_{SW} = \sqrt{\frac{m + k + 1}{m + j}}\;a_{HW}, \tag{18}$$

where m is the number of message digits, k the number of check digits, and j the number of
syllables. The quantity C_HW(a) is given by

$$C_{HW}(a) = \frac{2\;{}^{2}\Pi_n(a)}{n(n-1)\,q^{n-2}(a)\,p^{2}(a)}, \qquad n = m + k + 1. \tag{19}$$

TABLE II

COMPARISON OF THE HAMMING-WAGNER AND SYLLABIFIED WAGNER CODES

 m     P_HW       C_HW     j     P_SW
12    0.00107    0.77      1    0.00081
                           2    0.00087
14    0.00143    0.75      1    0.00148
                           2    0.00144
16    0.00186    0.73      2    0.00228
18    0.00236    0.72      2    0.00322
                           3    0.00342
20    0.00292    0.70      2    0.00438
                           3    0.00449
22    0.00356    0.68      2    0.00577
                           3    0.00570
24    0.00426    0.67      3    0.00700
                           4    0.00735
30    0.00730    0.62      3    0.00981
                           4    0.00975
                           5    0.01006
42    0.0146     0.55      5    0.0200
                           6    0.0201
54    0.0244     0.50      6    0.0318
                           7    0.0317
72    0.0448     0.43      8    0.0468
80    0.0688     0.38     10    0.0659

j = number of syllables

D. THE REED CODE

We now examine the performance in a constant-data-rate system of the Reed code,³
the only known example of a systematic multiple-error-correcting code. First we describe the
code briefly.

The Reed code is applicable only when the total number of digits in a word is a power
of 2. Corresponding to each possible word length, there are only certain possible values of the
order to which errors may be corrected. For each of these possible values, the number of message
digits is determined. This feature limits the application of the code in communication systems,
for the number of message digits in a word (fixed by other considerations) may not
correspond to a possible choice in a Reed code. Table III shows the relations between the number
of message digits, the distance between possible transmitted messages (see Appendix A), and the
order of errors corrected and detected for Reed-coded words of 2^v digits.

As indicated in Table III, the Reed code not only corrects all errors up to a given
order, but also detects errors of that order plus one. This feature is not an advantage in our
case, since there is no indication of the correct replacement for the detected mistaken word. In
some cases the Reed code corrects errors of order higher than indicated in Table III, but no
analysis of this phenomenon has been made. It follows that the probabilities of error for
Reed-coded words calculated below are only upper bounds, albeit probably good ones because of the
high order of the extra corrected errors.

TABLE III

CHARACTERISTICS OF THE REED CODE
(Words of 2^v digits)

Number of                        Distance between       Order to which     Order of
message digits                   possible transmitted   errors are         detected
                                 messages               corrected          errors

1 + v                            2^(v-1)                2^(v-2) - 1        2^(v-2)
1 + v + C(v,2)                   2^(v-2)                2^(v-3) - 1        2^(v-3)
1 + v + C(v,2) + ... + C(v,j)    2^(v-j)                2^(v-j-1) - 1      2^(v-j-1)
2^v - 1                          2                      0                  1
2^v                              1                      0                  0

C(v,i) denotes the binomial coefficient.

Table IV gives a few of the numbers corresponding to the formulas of Table III. The
lack of flexibility in the simultaneous choices of number of message digits and order of errors
corrected can be seen at once.

The encoded message is obtained by multiplying the message digits by certain standard
sequences of n = 2^v digits, and then adding the products modulo 2. Decoding is accomplished
by choosing that digit given by the majority of a set of sums (again modulo 2), a given standard
set corresponding to each message digit. Complete details are to be found in Reed's paper.³
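
The encoding rule can be sketched as follows, under the assumption that the standard sequences are the evaluation vectors of all products of at most r of v binary variables (the usual Reed-Muller construction; the report itself defers the details to Reed's paper). The function names are hypothetical, and only encoding is shown, not the majority decoding. The minimum weight of the nonzero code words, found here by exhaustion, equals the distance between possible transmitted messages.

```python
from itertools import combinations, product

def standard_sequences(v, r):
    """Length-2**v evaluation vectors of all monomials of degree <= r (assumed form)."""
    points = list(product((0, 1), repeat=v))
    rows = []
    for degree in range(r + 1):
        for subset in combinations(range(v), degree):
            # The empty subset gives the all-ones sequence.
            rows.append(tuple(int(all(pt[i] for i in subset)) for pt in points))
    return rows

def encode(message, rows):
    """Multiply message digits by the sequences and add the products modulo 2."""
    word = [0] * len(rows[0])
    for bit, row in zip(message, rows):
        if bit:
            word = [w ^ x for w, x in zip(word, row)]
    return tuple(word)

def min_weight(rows):
    """Smallest number of ones in any nonzero code word (exhaustive search)."""
    best = len(rows[0])
    for message in product((0, 1), repeat=len(rows)):
        if any(message):
            best = min(best, sum(encode(message, rows)))
    return best

first_order = standard_sequences(4, 1)    # 16-digit words, 5 message digits
second_order = standard_sequences(4, 2)   # 16-digit words, 11 message digits
```

For 16-digit words this yields 5 message digits at distance 8 and 11 message digits at distance 4, in line with the two n = 16 entries of Table IV.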

Tables III and IV give enough information on the number of check digits, the order of
errors corrected, etc. to study the performance of the Reed code in a constant-data-rate system.
Table V shows corresponding probabilities of error per word for a Reed three-error-correcting
code, the Hamming single-error-correcting code, and no code at all. The formulas for P_U and
P_H are the same as those given in Eqs. (10) and (11); the probability P_R is given by

$$P_R = 1 - \sum_{i=0}^{3} \binom{m + k_R}{i}\, q^{\,m + k_R - i}(a_R)\; p^{\,i}(a_R), \tag{20}$$

where k_R is the number of Reed check digits; k_H is the number of Hamming check digits
required for m message digits (see Table IV). The relations required to find the a's used in
Table V are

$$a_U = \sqrt{\frac{m + k_H}{m}}\;a_H, \qquad a_R = \sqrt{\frac{m + k_H}{m + k_R}}\;a_H. \tag{21}$$

TABLE IV

NUMERICAL EXAMPLES OF THE REED AND HAMMING CODES

  n      m     k_R    k_H    Order to which
                             errors are corrected
  8      4      4      3      1
 16      5     11      4      3
 16     11      5      4      1
 32      6     26      4      7
 32     16     16      5      3
 32     26      6      5      1
 64      7     57      4     15
 64     22     42      5      7
 64     42     22      6      3
 64     57      7      6      1
128     99     29      7      3
256    219     37      8      3

TABLE V

PROBABILITIES OF ERROR FOR UNCODED, HAMMING-CODED,
AND REED-CODED WORDS

  m     a_H     P_U        P_H        P_R
 16     1.5    0.114      0.049      0.047
 16     2.0    0.00953    0.00112    0.00042
 42     1.5    0.389      0.195      0.163
 42     2.0    0.0512     0.0057     0.0011
 99     1.5    0.754      0.538      0.449
 99     2.0    0.1558     0.0259     0.0041
219     1.5    0.967      0.899      0.839
219     2.0    0.354      0.100      0.018
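
Equations (20) and (21) involve only binomial sums, so they can be evaluated directly. In the sketch below the function names are hypothetical, and the values k_R = 16 and k_H = 5 for m = 16 are read from Table IV.

```python
import math

def p(a):
    return 0.5 * (1.0 - math.erf(a))

def q(a):
    return 0.5 * (1.0 + math.erf(a))

def uncoded_word_error(m, k_h, a_h):
    """P_U, with the digit length scaled up from the Hamming-coded word."""
    a_u = math.sqrt((m + k_h) / m) * a_h
    return 1.0 - q(a_u) ** m

def reed_word_error(m, k_r, k_h, a_h, order=3):
    """P_R: probability of more than `order` digit errors in the Reed-coded word."""
    n = m + k_r
    a_r = math.sqrt((m + k_h) / n) * a_h   # shorter digits, hence smaller a
    ok = sum(math.comb(n, i) * q(a_r) ** (n - i) * p(a_r) ** i
             for i in range(order + 1))
    return 1.0 - ok

p_u_16 = uncoded_word_error(16, 5, 2.0)
p_r_16 = reed_word_error(16, 16, 5, 2.0)
```

For m = 16 and a_H = 2.0 the results lie close to the corresponding row of Table V.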

We see from Table V that the Reed code outperforms the Hamming code, even for
m = 16. Thus the decrease in a produced by the extra check digits of the Reed code is more than
compensated by the ability to correct all double and triple errors. The advantage is more marked
for larger a, since in high noise many more errors of order greater than three are introduced
by the shortening of the digit length.

In Table VI, the Reed code is compared at three of its allowed values of m with the
best of the Wagner codes. The probability of error for uncoded words is given for reference.

TABLE VI

COMPARISON OF REED CODE WITH HAMMING-WAGNER
AND SYLLABIFIED WAGNER CODES

 m     a_HW     P_U       P_HW      P_R
16     1.80    0.0224    0.0019    0.0022
42     1.80    0.1181    0.0146    0.0097

 m     a_SW     j      P_U       P_SW      P_R
99     1.50    10     0.726     0.359     0.403
99     2.00    10     0.1379    0.0151    0.0029

We see that the Wagner-type codes can compete with the Reed code in high noise.
As the noise decreases or m increases, the Reed code increases its advantage. It is clear that
for ordinary communication purposes, the Reed code would be better for long words than any of
the previously considered codes if the restriction on the allowed values of m could be removed.
Attempts are being made to modify the code to give a greater number of possible message lengths,
but as yet no systematic multiple-error-correcting code suitable for an arbitrary number of
message digits has been found.
E. SUMMARY AND CONCLUSIONS

We have considered the use of several types of binary codes in communication systems,
making the following assumptions:

(1) The system transmits sequences of binary digits known as words. If any
digit is altered, the information carried by a word is lost. Thus, by definition,
sequences obtained by combining words are not themselves words.

(2) The transmitted digits are one of two electrical signals of bandwidth W and
duration T. They have equal energies and equal a priori probabilities.

(3) The entire coded word must be transmitted in a given time, regardless of
the number of code digits required to check the message digits. (Assumption of
constant data-rate.)

(4) The transmitted digits are corrupted by the addition of white Gaussian noise.
They are determined by choosing the larger of two independent and normally distributed
correlator outputs. The time-bandwidth product, TW, of the transmitted signals is ≫ 1,
so that when the signal length is changed to accommodate different numbers of check
digits, the signal-to-noise ratio of the correlator difference voltage is proportional to
the square root of the signal length. (Actually, the signal-to-noise ratio is proportional
to √(TW), but we assume that W is not changed, an assumption that requires TW ≫ 1.)

By the best code (of those we consider) for a given word length and channel noise, we
mean that for which the probability of error per word is smallest (under the assumption of
constant data-rate). We have considered the following systematic codes: (1) the Hamming
single-error-correcting code, (2) the Wagner code, (3) the Hamming-Wagner code, (4) the syllabified
Wagner code, and (5) the Reed multiple-error-correcting codes. (The Wagner, Hamming-Wagner
and syllabified Wagner codes are introduced in this paper.) For short words (m < about 15) we
find that the Wagner code is best in the range of interest (neither too little nor too much noise).
As m increases, the Wagner code is surpassed by both the Hamming-Wagner code and a
syllabified Wagner code of two syllables.* For values of m < about 80, all syllabified Wagner codes are
inferior to the Hamming-Wagner code. For larger m, the conditional probability that double
errors are corrected by the Hamming-Wagner code has fallen sufficiently so that a syllabified
Wagner code is better. Thus, were it not for the Reed code (which is only applicable for a few
word lengths), we could say that the Wagner code is best for short words, the Hamming-Wagner
code for medium length and long words, and the syllabified Wagner code for very long words.
However, the Reed code outperforms the Hamming-Wagner code at m = 42 and the syllabified
Wagner code at m = 99 (except in excessively high noise). Thus for large m there is no
substitute for multiple-error-correcting codes that do not use the Wagner principle. We can safely
say that if the Reed code can be generalized to apply to any number of message digits, it will be
the best code except for short words. This assumes that the proportion of check to message
digits turns out to be comparable to that of the present Reed code.

The numerical work reported here was done by Mrs. Elizabeth Munro.


*The Hamming code surpasses the Wagner code for m > about 20, but is always inferior to the
Hamming-Wagner code.
APPENDIX A

The salient features of a geometrical model that is often useful as a visual aid in
coding problems are given in the following discussion. This model has already been used to
advantage by Hamming.²

The set of possible sequences of n binary digits can be represented by the unit cube
in a space of n dimensions. The sequences are the vertices of the cube, and the distance between
two vertices is defined as the number of binary digits in which the corresponding sequences
differ. The set of all points at a distance ≤ k from a given point is called the sphere of radius k
about that point. The volume of a sphere of radius k, defined as the number of points in the
sphere, is

$$\sum_{i=0}^{k} \binom{n}{i}.$$

It can be seen that if we choose any set of mutually exclusive volumes in message
space, designate one point in each volume as a possible transmitted message, and identify all
other points in the volume with this point, we have constructed a model of an error-correcting
code. In particular, if the volumes are spheres of radius k, we have constructed a code that
will correct any number of errors ≤ k.

If the message space is divided into spheres of radius one, we have the geometrical
model of the Hamming single-error-correcting code. Unless n is of the form 2^l - 1, there are
points that do not belong to any sphere. (If n = 2^l - 1, the space can be fully packed by 2^(n-l)
spheres of radius one and volume 2^l.) Points not in any sphere are said to be in limbo. They
represent messages that cannot become one of the set of possible transmitted messages by
correction of only one digit. For such a point, the Hamming parity checks call for a change in
a digit whose order number is greater than the length of the sequence.

Figure 2 illustrates a Hamming single-error-correcting code for a space of
$7 = 2^3 - 1$ dimensions. The cube has 128 vertices, and can be fully packed by 16 spheres of
radius 1 and volume 8, the centers of which correspond to the 16 possible sequences of 4 binary
digits. The centers of the 16 spheres are circled. Note that every point lies in some
sphere.

Figure 3 illustrates the Hamming code for 6 (not of the form $2^l - 1$) dimensions.
The centers of the 8 spheres are circled. Note that there are 8 points not in any sphere. These
points (enclosed in squares) are the limbo. The dotted outlines in Figs. 2 and 3 indicate typical
spheres in each space.
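
The counts quoted for Figs. 2 and 3 can be checked by enumeration. The sketch below assumes the usual Hamming arrangement in which the parity checks, read as a binary number, give the order number of the digit to be changed; the assignment of check digits in the figures themselves is not reproduced here, and the names `syndrome` and `classify` are hypothetical.

```python
from itertools import product


def syndrome(word):
    """XOR of the 1-based order numbers of the digits equal to 1."""
    s = 0
    for position, digit in enumerate(word, start=1):
        if digit:
            s ^= position
    return s


def classify(n):
    """Count sphere centers, points one change from a center, and limbo points."""
    counts = {"centers": 0, "correctable": 0, "limbo": 0}
    for word in product((0, 1), repeat=n):
        s = syndrome(word)
        if s == 0:
            counts["centers"] += 1       # all parity checks satisfied
        elif s <= n:
            counts["correctable"] += 1   # checks point at a digit inside the word
        else:
            counts["limbo"] += 1         # checks point beyond the end of the word
    return counts


print(classify(7))  # {'centers': 16, 'correctable': 112, 'limbo': 0}
print(classify(6))  # {'centers': 8, 'correctable': 48, 'limbo': 8}
```

For n = 6 the limbo consists of exactly those sequences whose checks call for a change in a seventh digit, in agreement with the description of limbo given above.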

Double-error-correcting  codes  of  dimension  less  than  90  must  have  a limbo.  For  if 
such  a code  has  no  limbo,  the  volume  V of  the  message  sphere  (of  radius  2)  must  divide  the 
volume $2^n$ of the whole space. Thus the equation

$$V = 1 + n + \binom{n}{2} = \tfrac{1}{2}(n^2 + n + 2) = 2^k \qquad \text{(A-1)}$$

must be satisfied for integral n and k. By inspection, we find that the only solutions of Eq. (A-1)
for n < 90 are n = 1, 2, and 5. The first two values are meaningless, and the third is the trivial
two-error-correcting code consisting of the two points 00000 and 11111 (i.e., one message digit
and four check digits).
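
The inspection can be repeated mechanically. The following sketch, with hypothetical function names, lists every n from 1 through 90 for which the volume of a sphere of radius 2 is a power of two. This divisibility condition is only necessary for a code without limbo; the sketch says nothing about whether such a code exists.

```python
def volume_radius_two(n):
    """Volume 1 + n + C(n, 2) of a sphere of radius 2 in n dimensions."""
    return 1 + n + n * (n - 1) // 2


def is_power_of_two(v):
    """True when v is a positive integer with a single binary digit set."""
    return v > 0 and v & (v - 1) == 0


solutions = [n for n in range(1, 91) if is_power_of_two(volume_radius_two(n))]
print(solutions)              # [1, 2, 5, 90]
print(volume_radius_two(90))  # 4096
```

The search confirms that 1, 2, and 5 are the only solutions below 90, and that n = 90 itself satisfies the condition with V = 4096.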

The geometrical model appropriate to the Hamming single-error-correcting, double-error-detecting
code and the Hamming-Wagner code is obtained by adding an extra dimension to
configurations such as those of Figs. 2 and 3. This extra dimension corresponds to the extra
check digit required to detect double errors. Some errors involving three or more digits violate
this extra parity check and some do not. It is possible to calculate how many errors of each
order violate the parity check and how many do not, but this is hardly worthwhile, since those
that violate the check are indistinguishable from double errors and those that do not are
indistinguishable from single errors or no errors at all.

Mr.  Oliver  Selfridge  has  helped  greatly  in  preparing  Appendix  A. 
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APPENDIX  B 


The transition from Eq. (I-22) to Eq. (4) is derived in detail as follows. Equation (I-22)
can be written

$$\Pi_n(\alpha) = \sum_{j=1}^{n-1} \frac{(-1)^{n-j+1}}{2^{n-j}} \binom{n-1}{j-1} \Pi_j(\alpha) + \frac{(-1)^{n+1}}{2^n} I_n(\alpha) , \qquad \text{(B-1)}$$


where we define

$$\Pi_1(\alpha) \equiv \tfrac{1}{2} I_1(\alpha) = p(\alpha) . \qquad \text{(B-2)}$$

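For n = 2 and n = 3, Eq. (B-1) reduces to

$$\Pi_2(\alpha) = \tfrac{1}{4} \left[ I_1(\alpha) - I_2(\alpha) \right] ,$$

$$\Pi_3(\alpha) = \tfrac{1}{8} \left[ I_1(\alpha) - 2 I_2(\alpha) + I_3(\alpha) \right] . \qquad \text{(B-3)}$$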

We now prove by induction that

$$\Pi_n(\alpha) = \frac{1}{2^n} \sum_{i=1}^{n} (-1)^{i+1} \binom{n-1}{i-1} I_i(\alpha) \qquad \text{(4)}$$

is valid for all n. Note first that by Eqs. (B-2) and (B-3), Eq. (4) obtains for n = 1, 2, and 3.
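Assume that Eq. (4) obtains for all j ≤ n - 1. Then

$$\Pi_n(\alpha) = \frac{1}{2^n} \sum_{j=1}^{n-1} \sum_{i=1}^{j} (-1)^{n+i-j} \binom{n-1}{j-1} \binom{j-1}{i-1} I_i(\alpha) + (-1)^{n+1} \frac{1}{2^n} I_n(\alpha)$$

$$= \frac{1}{2^n} \sum_{i=1}^{n-1} \left[ \sum_{j=i}^{n-1} (-1)^{n+i-j} \binom{n-1}{j-1} \binom{j-1}{i-1} \right] I_i(\alpha) + (-1)^{n+1} \frac{1}{2^n} I_n(\alpha) . \qquad \text{(B-4)}$$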

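Then, it follows from the relation

$$\sum_{j=i}^{n-1} (-1)^{n+i-j} \binom{n-1}{j-1} \binom{j-1}{i-1} = \binom{n-1}{i-1} \sum_{j=i}^{n-1} (-1)^{n+i-j} \binom{n-i}{j-i}$$

$$= \binom{n-1}{i-1} \sum_{r=0}^{n-i-1} (-1)^{n-r} \binom{n-i}{r} = (-1)^{i+1} \binom{n-1}{i-1}$$

that

$$\Pi_n(\alpha) = \frac{1}{2^n} \sum_{i=1}^{n} (-1)^{i+1} \binom{n-1}{i-1} I_i(\alpha) ,$$

which completes the induction.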
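
The induction can be spot-checked numerically with exact rational arithmetic. The sketch below rests on our own reading of the damaged equations: it assumes the recursion of Eq. (B-1) gives the n-th quantity as the sum over j = 1, ..., n-1 of (-1)^(n-j+1) C(n-1, j-1) / 2^(n-j) times the j-th quantity, plus (-1)^(n+1) I_n / 2^n, and that Eq. (4) has coefficients (-1)^(i+1) C(n-1, i-1) / 2^n on I_i. Each quantity is represented by its list of coefficients on I_1, ..., I_n, and all function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def closed_form(n):
    """Coefficients of I_1..I_n in the assumed form of Eq. (4)."""
    return [Fraction((-1) ** (i + 1) * comb(n - 1, i - 1), 2 ** n)
            for i in range(1, n + 1)]


def by_recursion(n_max):
    """Coefficient lists for n = 1..n_max built from the assumed form of Eq. (B-1)."""
    table = {}
    for n in range(1, n_max + 1):
        coeffs = [Fraction(0)] * n
        for j in range(1, n):
            weight = Fraction((-1) ** (n - j + 1) * comb(n - 1, j - 1), 2 ** (n - j))
            for i, c in enumerate(table[j]):
                coeffs[i] += weight * c
        coeffs[n - 1] += Fraction((-1) ** (n + 1), 2 ** n)  # the I_n term
        table[n] = coeffs
    return table


def identity_holds(n, i):
    """Check the binomial relation used in the induction step, for 1 <= i <= n - 1."""
    lhs = sum((-1) ** (n + i - j) * comb(n - 1, j - 1) * comb(j - 1, i - 1)
              for j in range(i, n))
    return lhs == (-1) ** (i + 1) * comb(n - 1, i - 1)


table = by_recursion(12)
print(all(table[n] == closed_form(n) for n in range(1, 13)))               # True
print(all(identity_holds(n, i) for n in range(2, 13) for i in range(1, n)))  # True
```

A check of this kind covers only the values of n tried and does not replace the induction; it serves to confirm that the signs and binomial coefficients are mutually consistent.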

It  can  be  shown  in  just  the  same  way  that  Eq.  (7)  implies  Eq.  (9). 
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